The degradation suffered when pulses satisfying the Nyquist criterion
are used to transmit binary data in noise at supraconventional rates is 
studied. Optimum processing of the received waveforms is assumed, and 
attention is focused on the minimum distance between signal points as a 
performance criterion. An upper bound on this distance is given as a 
function of signaling speed. In particular, the pulse energy seems to be 
the minimum distance up to rates of transmission 25 percent faster than 
the Nyquist rate, but not beyond. 

Some mathematical aspects related to the above problem are also considered.
In particular, the minimum distance is rigorously shown to be
nonzero for all transmission rates. This is tantamount to showing that, 
in the singular case of linear prediction, perfect prediction cannot be 
approached with bounded prediction coefficients. 

I. INTRODUCTION 

The use of Nyquist pulses 

$$g(t) = \frac{\sin(\pi t/T)}{\pi t/T}$$



to send binary (or multilevel) data without intersymbol interference 
over a channel of bandwidth W = 1/(2T) Hz is classic. If we assume
that one receives the pulse train 

$$u(t) = \sum_{n=N_1}^{N_2} a_n\, g(t - nT), \qquad a_n = \pm 1, \text{ independently}, \tag{1}$$

in additive white Gaussian noise of two-sided spectral density $N_0/2$,
then the optimum detector has a bit-error rate $P_e$ given by



$$P_e = Q\!\left(\sqrt{\frac{2E}{N_0}}\right), \tag{2}$$

where

$$Q(x) = \frac{1}{\sqrt{2\pi}} \int_x^{\infty} e^{-y^2/2}\, dy = \frac{1}{2}\,\operatorname{erfc}\!\left(\frac{x}{\sqrt{2}}\right), \tag{3}$$

erfc(·) denoting the complementary error function, and $E$ being the energy in the
pulse $g(t)$. In our case, $E = T$. Asymptotically, for large signal-to-noise
ratios, (2) becomes

$$P_e \sim \frac{1}{2\sqrt{\pi E/N_0}}\, e^{-E/N_0}. \tag{4}$$

We now address the following question : Suppose that in transmitting 
(1) we obtain a performance from (2) that is more than satisfactory. 
Thus, we may have a $P_e$ of $10^{-6}$ or $10^{-7}$ when $10^{-6}$ would be adequate.
To what extent can we trade this "excess performance" for speed by 
replacing T by T' < T in (1), while keeping transmitted power 
constant? In other words, we still use pulses 

$$g(t) = \frac{\sin(\pi t/T)}{\pi t/T}, \tag{5}$$

but send them at intervals T' < T. We call this faster-than-Nyquist
transmission and shall characterize T' by writing T' = pT, 0 < p < 1.
A particular motivation for this problem is to mathematically model, 
in a simple way, what would happen if voice-band telephone channels 
are "pushed" to their limits with more rapid transmission of pulses 
than has been conventional. 

While simple detectors that match filter and sample can still be 
used for faster-than-Nyquist transmission, their performance is 
suboptimum.¹ We are concerned here with optimum detectors. Since
exact analysis of nonlinear detectors is not presently feasible, we
choose to give our detectors the benefit of the doubt and work rather
with lower bounds to $P_e$. Nevertheless, interesting results can be
obtained regarding the trade-off considered here. To see why degradation
in error rate is inevitable, note that (2) is the well-known matched
filter bound for antipodal pulses, each of energy E, which must bound 
performance for bit detection with a sequence of (perhaps interfering) 
pulses. On the other hand, as T' decreases, pulses are sent faster 
and the energy E in each pulse must be decreased in direct proportion 
so that the power E/T' is kept constant. This is an immediate unavoidable
element in performance degradation, and may be regarded
as a "fair" trade-off. Another cause of degradation is the degree to 
which the optimum detector can cope with the interference among 
pulses, i.e., the fact that the performance will drop below that of (2). 
Here, bounds other than (2) are useful, and in fact are the first item 
taken up in the next section. 
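
The "fair" part of this trade-off can be illustrated numerically. The following is a minimal sketch, assuming constant power so that the pulse energy scales as E = p E_1 (E_1 being the pulse energy at the Nyquist rate, p = T'/T) and evaluating only the matched filter bound (2). The function names and the sample initial error rates are illustrative choices, not taken from the text.

```python
import math
from statistics import NormalDist


def q_function(x: float) -> float:
    """Gaussian tail probability Q(x) = 0.5 * erfc(x / sqrt(2)), as in (3)."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def q_inverse(prob: float) -> float:
    """Return x such that Q(x) = prob, via the standard normal quantile."""
    return -NormalDist().inv_cdf(prob)


def matched_filter_bound(initial_pe: float, p: float) -> float:
    """Matched filter bound (2) at speed-up factor p = T'/T, constant power.

    Assumption: the pulse energy scales as E = p * E_1, so the argument
    of Q in (2) shrinks by sqrt(p) relative to its value at p = 1.
    """
    x_nyquist = q_inverse(initial_pe)  # equals sqrt(2 E_1 / N_0) at p = 1
    return q_function(math.sqrt(p) * x_nyquist)


if __name__ == "__main__":
    for initial_pe in (1e-6, 1e-7):
        degraded = matched_filter_bound(initial_pe, 0.8)
        print(f"P_e at p=1: {initial_pe:.0e} -> bound at p=0.8: {degraded:.2e}")
```

Under these assumptions, a 25 percent speed-up (p = 0.8) raises the matched filter bound by roughly one order of magnitude for both sample error rates.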

II. DISCUSSION OF LOWER BOUND FOR ERROR RATE

Assuming (1) is received in white noise and an optimum detector is 
used for detecting the kth bit, a lower bound on the chance of making 
an error on this kth bit will now be derived. Since the data $a_n$ are independent,
this bound also serves for any sequence (1) starting at
$n = N_1' \le N_1$ and ending at $n = N_2' \ge N_2$. We begin with the fact
that, for a binary hypothesis problem with equal a priori probabilities
and having $p_+(x)$ or $p_-(x)$ as the two probability densities of the
received signal $x$ under the two respective hypotheses, one way² to
write the probability of error is

$$P_e = \frac{1}{2} \int \min\,[p_+(x),\, p_-(x)]\, dx. \tag{6}$$

If we let $u_\pm^i(t)$ be a particular one of the equiprobable $2^N$ signals in
(1), $N = N_2 - N_1$, which have ±1 in the kth position, then formally

$$p_\pm(x) = \frac{1}{2^N} \sum_{i=1}^{2^N} p_\pm^i(x), \tag{7}$$

where $p_\pm^i(x)$ is the density of the observations conditioned on the
entire sequence. Thus,

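$$P_e = \frac{1}{2}\,\frac{1}{2^N} \int \min\!\left( \sum_{i=1}^{2^N} p_+^i(x),\ \sum_{j=1}^{2^N} p_-^j(x) \right) dx \;\ge\; \frac{1}{2^N} \sum_{i=1}^{2^N} \frac{1}{2} \int \min\,[p_+^i(x),\, p_-^{j(i)}(x)]\, dx. \tag{8}$$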



In writing (8), we have made use of the fact that the minimum of two
sums with an equal number of terms is at least as large as the sum of
the minimum of the two ith terms of each series. Of course, each series
can be arranged in any permuted order before the pair-wise minimum
is taken and, thus, the pairings $i$ with $j(i)$ are indicated in (8) to allow
for this permutation. Now

$$\frac{1}{2} \int \min\,[p_+^i(x),\, p_-^{j(i)}(x)]\, dx \tag{9}$$

is the probability of error with two fixed signals and has the well-known 
evaluation 

$$Q\!\left(\frac{d(i,j)}{\sqrt{2N_0}}\right), \tag{10}$$

where 

$$d^2(i,j) = \int_{-\infty}^{\infty} [u_+^i(t) - u_-^j(t)]^2\, dt \tag{11}$$

is the "distance" between two sequences (1) which differ in the kth
position. Equation (8) then reads

$$P_e \ge \frac{1}{2^N} \sum_{i=1}^{2^N} Q\!\left(\frac{d[i,j(i)]}{\sqrt{2N_0}}\right) \tag{12}$$

for any set of pairings $[i, j(i)]$. The bound (12) is intimately related
to Forney's lower bound,³ although our derivation is quite different.
Forney's bound in the present situation reads

$$P_e \ge p_m\, Q\!\left(\frac{d_{\min}}{\sqrt{2N_0}}\right), \tag{13}$$

where $d_{\min}$ is the minimum distance between signals (1) which differ
in the kth position, and $p_m$ is the probability that a sequence chosen
at random has a sequence with opposite polarity in the kth position
at distance $d_{\min}$. Equation (12) can be made to yield something like
(13). Thus, in (12) discard all terms except for those pairings $[i, j(i)]$
such that $d[i, j(i)] = d_0$. Then (12) implies

$$P_e \ge \frac{\text{no. of pairings}}{2^N}\, Q\!\left(\frac{d_0}{\sqrt{2N_0}}\right). \tag{14}$$

The coefficient in front of the Q function corresponds to the probability
coefficient in (13). Choosing $d_0 = d_{\min}$ yields (13), but when
we are not able to find $d_{\min}$, eq. (14) will serve our purpose.

III. ESTIMATING THE MINIMUM DISTANCE 

Clearly, in (14) we should like to find the smallest $d_0$ to maximize
the lower bound, provided the coefficient is not too small. In our
problem, $d_{\min}^2$ is given by

$$\frac{d_{\min}^2}{4E} = \inf_{N;\ a_l = \pm 1,\, 0}\ \frac{1}{2\pi p} \int_{-p\pi}^{p\pi} \left| 1 - \sum_{l=1}^{N} a_l e^{il\theta} \right|^2 d\theta, \tag{15}$$



where we have normalized by dividing by the pulse energy E. The 
expression (15) comes from taking the Fourier transform of (11) and 
manipulating the resulting expression slightly. We note particularly 
that in (15) only positive values of $l$ need be considered, since

$$\left| 1 - \sum_{\substack{l=-K \\ l \ne 0}}^{M} a_l e^{il\theta} \right|^2 = \left| 1 - \sum_{l=1}^{M+K} b_l e^{il\theta} \right|^2$$

if $a_{-K} \ne 0$. We have set $b_l = -a_{-K}\, a_{l-K}$ if $l \ne K$ and $b_l = a_{-K}$ if
$l = K$.
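
The manipulation leading from (11) to (15) is not written out above. One way to fill it in is sketched below, under the following assumptions, none of which are spelled out in the text: the pulses are sent at spacing T' = pT; $G(\omega)$ denotes the Fourier transform of $g(t)$, equal to T for |ω| < π/T and zero elsewhere; and $e_n \in \{0, \pm 1\}$ (a symbol introduced here only for illustration) is half the difference of the two data sequences, so that $e_k = \pm 1$.

```latex
\begin{align*}
u_+^i(t) - u_-^j(t) &= 2 \sum_n e_n\, g(t - nT'), \\
d^2(i,j) &= \frac{4}{2\pi} \int_{-\pi/T}^{\pi/T} T^2
            \Big| \sum_n e_n e^{-i\omega n T'} \Big|^2 d\omega
            && \text{(Parseval)} \\
         &= \frac{4T}{2\pi p} \int_{-p\pi}^{p\pi}
            \Big| \sum_n e_n e^{-in\theta} \Big|^2 d\theta
            && (\theta = \omega T' = \omega p T), \\
\frac{d^2(i,j)}{4E} &= \frac{1}{2\pi p} \int_{-p\pi}^{p\pi}
            \Big| 1 - \sum_{l \ne 0} a_l e^{il\theta} \Big|^2 d\theta
            && (E = T,\ a_l = -e_k e_{k+l}).
\end{align*}
```

The last line uses $e_k^2 = 1$ and the symmetry of the integration interval under θ → −θ. The restriction to positive $l$ then follows from the shift identity given above.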

We cannot claim to have found the minimum value of (15). However,
a simple numerical effort has yielded the results for $d_0^2/4E$ shown
in Fig. 1, where $d_0$ refers to the smallest distance we have found. We
note in particular that $d_0^2/4$ is the pulse energy for p decreasing from 1
to 0.8, or, in other words, for rates exceeding the Nyquist rate by up to
25 percent [percentage of excess = 100(1/p − 1)]. Thus, $d_{\min}^2/4$
cannot be the pulse energy for p < 0.8 for this problem. By the time
p has decreased to 0.5, $d_0^2/4E$ has dropped to 0.465. (G. J. Foschini
has informed the author that the use of the polynomial
$p(z) = 1 - z + z^3 - z^4 + z^6 - z^7$, $z = \exp(i\theta)$, results in the value 0.410 for
$d_0^2/4E$ at p = 0.5.) Except for some points in the neighborhood of
p = 0.4, the values for $d_0^2$ have been obtained by considering numerically
the best value of K which minimizes, for not too large K,

$$\frac{1}{2\pi p} \int_{-p\pi}^{p\pi} \left| 1 + \sum_{l=1}^{K} (-1)^l e^{il\theta} \right|^2 d\theta. \tag{16}$$

These points are labeled with the appropriate value of K in Fig. 1.

Fig. 1. The smallest distances between signal sequences that we have found are
shown here for different values of signaling rate. Labeling a point by K indicates
that the polynomial is

$$p(z) = 1 + \sum_{l=1}^{K} (-1)^l z^l.$$

Somewhat surprisingly, the larger values of K are responsible for
decreasing $d_0$ initially (K = 7 at p = 0.8), and then K gradually
becomes smaller (K = 2 at p = 0.5). The value obtained with K = 1
always was suboptimum, as was the limiting value of (16) when
$K \to \infty$, which is easily shown to be

$$\frac{1}{\pi p} \tan\frac{\pi p}{2}. \tag{17}$$
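
Quantities such as (16) are easy to evaluate numerically. The following is a minimal sketch, assuming the normalization $1/(2\pi p)$ over $(-p\pi, p\pi)$ used in (15) and (16); expanding the squared magnitude and integrating term by term turns the integral into a double sum of sinc values. The function names are hypothetical, and the nested loop is chosen for clarity rather than speed.

```python
import math


def sinc(x: float) -> float:
    """Normalized sinc: sin(pi x) / (pi x), with sinc(0) = 1."""
    if x == 0.0:
        return 1.0
    return math.sin(math.pi * x) / (math.pi * x)


def normalized_distance(coeffs, p):
    """Evaluate (1/(2 pi p)) * integral over (-p pi, p pi) of |P(e^{i theta})|^2.

    P(z) = sum_n coeffs[n] * z**n. Term-by-term integration gives
    sum over m, n of c_m * c_n * sinc(p * (m - n)).
    """
    total = 0.0
    for m, cm in enumerate(coeffs):
        for n, cn in enumerate(coeffs):
            total += cm * cn * sinc(p * (m - n))
    return total


def alternating(K):
    """Coefficients of 1 + sum_{l=1}^{K} (-1)^l z^l, the polynomial in (16)."""
    return [(-1) ** l for l in range(K + 1)]


def limit_value(p):
    """Limiting value (17) of (16) for large K: tan(pi p / 2) / (pi p)."""
    return math.tan(math.pi * p / 2.0) / (math.pi * p)


if __name__ == "__main__":
    # Polynomial attributed to Foschini: 1 - z + z^3 - z^4 + z^6 - z^7
    foschini = [1, -1, 0, 1, -1, 0, 1, -1]
    print("Foschini, p=0.5:", round(normalized_distance(foschini, 0.5), 3))
    for K in range(1, 9):
        print("K =", K, "p=0.8:", round(normalized_distance(alternating(K), 0.8), 4))
    print("K=400 vs (17), p=0.5:",
          normalized_distance(alternating(400), 0.5), limit_value(0.5))
```

With these definitions the Foschini polynomial gives approximately 0.410 at p = 0.5, in agreement with the value quoted above, and K = 7 is the alternating polynomial whose value falls just under 1 at p = 0.8.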

Why were the sequences given in (16) deemed to be of interest in 
the first place? The most interesting reason stems from the following 
argument. If one considers the Fourier transform of a doubly infinite 
pulse sequence like (1) when pulses are being sent faster than Nyquist 
and when the special case of the alternating sequence $a_n = (-1)^n$ is
being sent, one finds that the Fourier transform consists of delta
functions spaced at all odd multiples of π/T', that is, the Fourier
transform is out-of-band, which suggests zero received energy. Actually,
the doubly infinite model and its δ-function Fourier transforms
are idealizations representing limiting behavior for signals consisting
of pulses extending from (−N, N) and N becoming large. We are
really concerned with limiting behavior of the energy contained in
the frequency interval (−π/T, π/T), with T > T', and evidently for
the present case, if $S_N(\omega)$ is the Fourier transform of the truncated
pulse sequence,

$$\lim_{N \to \infty} \frac{1}{2\pi} \int_{-\pi/T}^{\pi/T} |S_N(\omega)|^2\, d\omega \;\ne\; \frac{1}{2\pi} \int_{-\pi/T}^{\pi/T} \Big| \lim_{N \to \infty} S_N(\omega) \Big|^2 d\omega = 0. \tag{18}$$

In spite of the above subtlety, however, sequences which are alternating
at least over part of their range are interesting and one might
expect difficulty distinguishing between one such sequence and its 
negative. 

In addition to the normalized distances given in Fig. 1, Fig. 2 plots 
the numerical values of lower bounds computed from expression (14), 
as well as the matched filter bound. These curves all assume constant 
power. Curves with initial (p = 1) error rates of $10^{-6}$ and $10^{-7}$ are
chosen as examples in Fig. 2. In both cases, an order of magnitude of
degradation in error rate is seen for a 25-percent increase in bit rate
(p = 0.8) using only the matched filter bound. Decreasing p further
on the $10^{-7}$ curve illustrates further degradations using (14) with an
appropriate value of K. These bounds do not show a departure from
the matched filter bound for as small a value of p as Fig. 1 would
suggest, because the coefficient $1/2^K$ to be used in (14) swamps the
effect of the decreasing "minimum" distance. For the $10^{-5}$ curve,
this effect extends to even smaller p and no lower bound other than 
the matched filter one is shown for that case. 

Fig. 2. Lower bounds on error rate vs signaling speed for two initial (p = 1)
cases. The solid curves are both matched filter bounds. The dashed curve is based
on minimum-distance considerations and applies to the $10^{-7}$ case. All curves are
drawn for constant power.
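
The interplay between the coefficient in (14) and the shrinking distance can be checked with a few lines of code. This is a sketch under stated assumptions rather than a reproduction of the curves in Fig. 2: power is held constant so that E = p E_1, the coefficient in (14) is read as $2^{-K}$, and the normalized distance `delta` $= d_0^2/4E$ is supplied by hand (the value 0.465 for K = 2 at p = 0.5 is the one quoted in Section III). All function and parameter names are hypothetical.

```python
import math
from statistics import NormalDist


def q_function(x: float) -> float:
    """Gaussian tail probability Q(x), as in (3)."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def bound_14(initial_pe: float, p: float, delta: float, K: int) -> float:
    """Lower bound (14) with coefficient 2**-K and normalized distance delta.

    Assumptions (illustrative): constant power, so E = p * E_1, which gives
    d_0^2 / (2 N_0) = delta * p * (2 E_1 / N_0), with delta = d_0^2 / (4 E).
    """
    snr_nyquist = NormalDist().inv_cdf(initial_pe) ** 2  # 2 E_1 / N_0 at p = 1
    return 2.0 ** (-K) * q_function(math.sqrt(delta * p * snr_nyquist))


def matched_filter_bound(initial_pe: float, p: float) -> float:
    """Matched filter bound (2): the special case delta = 1, coefficient 1."""
    return bound_14(initial_pe, p, delta=1.0, K=0)


if __name__ == "__main__":
    pe0 = 1e-7
    print("p=0.8  matched filter:", matched_filter_bound(pe0, 0.8))
    print("p=0.8  (14), K=7, delta=1:", bound_14(pe0, 0.8, 1.0, 7))
    print("p=0.5  matched filter:", matched_filter_bound(pe0, 0.5))
    print("p=0.5  (14), K=2, delta=0.465:", bound_14(pe0, 0.5, 0.465, 2))
```

Under these assumptions the factor $2^{-7}$ keeps (14) below the matched filter bound near p = 0.8, whereas at p = 0.5 the smaller distance makes (14) the tighter of the two lower bounds.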

IV. TWO MATHEMATICAL QUESTIONS 

As we have already emphasized, the infimum of the right member
of (15) over all the indicated trigonometric polynomials with ±1, 0
coefficients is not displayed in Fig. 1. Figure 1 simply shows the
smallest values we have found. Next, we want rigorously to establish
here that $d_{\min}^2 \ne 0$ if $p \ne 0$. Note that this would not be the case if
the coefficients $a_l$ in (15) were allowed to be any real numbers. In
fact, for any nonnegative function $f(\theta)$ with $\ln f(\theta) \in L_1(-\pi, \pi)$,
we have the Szegő theorem⁴ which states



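$$\inf_{N;\ a_l\ \text{real}}\ \frac{1}{2\pi} \int_{-\pi}^{\pi} \left| 1 - \sum_{l=1}^{N} a_l e^{il\theta} \right|^2 f(\theta)\, d\theta = \exp\left[ \frac{1}{2\pi} \int_{-\pi}^{\pi} \ln f(\theta)\, d\theta \right]. \tag{19}$$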



Expressions such as (19) occur, in particular, in linear prediction 
theory. 

In our case, $f(\theta) = 0$ if $|\theta| > p\pi$ and $\ln f(\theta)$ is not $L_1$, but the
appropriate limit of (19) indicates zero to be the infimum, which is
the correct answer.⁴ Thus, there is some cause to wonder if $d_{\min}^2$ as
defined in (15) is zero as well. We shall in fact show that it is not, and slightly more.

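Theorem 1: Let β be any positive (finite) real number and require $|a_l| \le \beta$,
$l = 1, 2, \ldots$. Then

$$\inf_{N;\ |a_l| \le \beta}\ \frac{1}{2\pi} \int_{-p\pi}^{p\pi} \left| 1 - \sum_{l=1}^{N} a_l e^{il\theta} \right|^2 d\theta > 0, \qquad p \ne 0. \tag{20}$$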



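Proof: We first note that if there exists a sequence $\{p_n(\theta)\}_{n=1}^{\infty}$ of
trigonometric polynomials of the form

$$p_n(\theta) = \sum_{l \ge 1} a_l(n)\, e^{il\theta}, \qquad |a_l(n)| \le \beta < \infty, \tag{21}$$

such that

$$\frac{1}{2\pi} \int_{-p\pi}^{p\pi} |1 - p_n(\theta)|^2\, d\theta \to 0, \tag{22}$$

then, for any $G(\theta) \in L_2(-p\pi, p\pi)$,†

$$\int_{-p\pi}^{p\pi} G(\theta)\, p_n(\theta)\, d\theta \to \int_{-p\pi}^{p\pi} G(\theta)\, d\theta. \tag{23}$$

This is simply a statement of the fact that if $p_n(\theta)$ converges strongly
to unity, it also converges weakly to unity. Now it is easy to see from
(23) and the form of $p_n(\theta)$ that

$$\beta \sum_{n=1}^{\infty} \left| \int_{-p\pi}^{p\pi} e^{in\theta} G(\theta)\, d\theta \right| \ge \left| \int_{-p\pi}^{p\pi} G(\theta)\, d\theta \right|. \tag{24}$$

Or, in other words, if

$$\beta < \sup_{G(\theta) \in L_2(-p\pi, p\pi)} \frac{\left| \int_{-p\pi}^{p\pi} G(\theta)\, d\theta \right|}{\sum_{n=1}^{\infty} \left| \int_{-p\pi}^{p\pi} e^{in\theta} G(\theta)\, d\theta \right|}, \tag{25}$$

then (22) cannot be true. In particular, if (25) holds with β ≥ 1, then
$d_{\min}^2$ is strictly positive. Regarding $G(\theta) \in L_2(-\pi, \pi)$ but supported
on $[-p\pi, p\pi]$, and calling

$$g(t) = \frac{1}{2\pi} \int_{-\pi}^{\pi} G(\theta)\, e^{i\theta t}\, d\theta, \qquad g_n = g(n), \tag{26}$$

the right member of (25) contains the quantity

$$\frac{|g_0|}{\sum_{n=1}^{\infty} |g_n|}. \tag{27}$$

† In addition to $G(\theta) \in L_2(-p\pi, p\pi)$, it will sometimes be convenient to regard
$G(\theta) \in L_2(-\pi, \pi)$ but having support confined to $(-p\pi, p\pi)$.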



Clearly, we have a question concerning the sample values $g_n$ at the
nonnegative integers of a function whose bandwidth is strictly less
than π. Normalizing (27) with $g_0 = 1$, (25) prompts the question:
How small can $\sum_{n=1}^{\infty} |g_n|$ be? If it can be zero, then (25) would be true
for any finite β. In fact, by Carlson's lemma,⁵ which states that a
band-limited function having a bandwidth less than π is uniquely
determined by its sample values taken at integers along a half line, it
follows that if $g_0 = 1$, then $\sum_{n=1}^{\infty} |g_n| \ne 0$. But Carlson's lemma does
not say that $\sum_{n=1}^{\infty} |g_n|$ cannot be made arbitrarily small under these
conditions. Lemma 1 (see below) shows that $\sum_{n=1}^{\infty} |g_n|$ can be arbitrarily
small. Thus, the right member of (25) is infinity, implying the
truth of Theorem 1.

An immediate corollary of Theorem 1 is that for the singular case 
of Szegő's theorem [$f(\theta)$ vanishing on an interval] the infimum value
of zero cannot be approached without using unbounded coefficients. 

Lemma 1: Let $g(t)$ [not identically zero and $\in L_2(-\infty, \infty)$] have
Fourier transform $G(\theta)$ supported on $(-p\pi, p\pi)$ for some fixed p, 0 < p
< 1. Denote the samples of $g(t)$ at the integers by $g_n$ [as in eq. (26)], and
fix the normalization of $g(t)$ by setting $|g_0| = 1$. Then

$$\inf \sum_{n=1}^{\infty} |g_n| = 0, \tag{28}$$

where the infimum is taken over all $g(t)$ having the indicated properties.

Proof: We begin with the simple, but crucial, remark that it is sufficient
that there be, for any p, a function $h(t; p) \in L_2(-\infty, \infty)$ whose
Fourier transform is supported on $(-p\pi, p\pi)$, such that $h(0; p) = 1$
and such that $\sum_{n=1}^{\infty} |h_n(p)|^2$ can be arbitrarily small.† This is sufficient,
because to make (27) large (for some fixed value of p) we would just
need to take

$$g(t) = h^2\!\left(t; \frac{p}{2}\right) \tag{29}$$

for an appropriate $h(t; p/2)$. Clearly, $g(t)$ is band-limited to pπ and is
$L_2(-\infty, \infty)$ because $h(t; p/2)$ is bounded:

$$\left| h\!\left(t; \frac{p}{2}\right) \right| \le \frac{1}{2\pi} \int_{-p\pi/2}^{p\pi/2} |H(\theta)|\, d\theta \le \frac{1}{2\pi} \left( p\pi \int_{-p\pi/2}^{p\pi/2} |H(\theta)|^2\, d\theta \right)^{1/2}. \tag{30}$$

† We are grateful to H. J. Landau for pointing this out. Landau has also supplied
an independent proof of the above refinement to Carlson's lemma, which we give in
the appendix.

But can we really find an appropriate $h(t)$ such that

$$h_0 = 1, \qquad \sum_{n=1}^{\infty} |h_n|^2 < \epsilon, \tag{31}$$

or, equivalently, can we find a real $h(t)$, band-limited to $(-p\pi, p\pi)$,
such that

$$(h_0 - 1)^2 + \sum_{n=1}^{\infty} h_n^2 < \epsilon\,? \tag{32}$$

Indeed we can, and in fact the answer may be extracted from an
article by Salz⁶ which discusses mean-square decision feedback equalization.
Salz, in Section V of his paper, considered the equalization
problem for faster-than-Nyquist signaling. His minimization problem
was of the form in (32) plus an added term for the noise variance;
$h(t)$ corresponds to the output of the equalizer when one pulse of the
form $\sin(p\pi t)/(p\pi t)$ is the input. He remarks, in the last sentence on
page 1354 of his paper, that the quantity that corresponds to (32)
plus added output noise variance goes to zero as the input noise
variance decreases. Hence, if we choose $h(t)$ to be the output pulse
of a decision-feedback equalizer whose taps have been optimized for
the case of sufficiently small input noise, then (32) will be sufficiently
small. Thus, Lemma 1 is proven.

The second question we discuss in this section is the rapidity with 
which the minimum distance decreases as p approaches zero. We 
develop this in Theorem 2. 



Theorem 2:

$$\lim_{p \to 0} \frac{d_{\min}^2}{p^k} = 0, \qquad \text{any } k > 0. \tag{33}$$



Proof: The proof is a simple construction. Consider the polynomials

$$P_L(z) = \prod_{l=0}^{L} \left( 1 - z^{2^l} \right). \tag{34}$$

Clearly, $P_L(z)$ has a zero of order (L + 1) at z = +1, and has ±1
coefficients, with $P_L(0) = +1$. Now, for small p, the (L + 1)st order
zero at z = 1 implies

$$\frac{1}{2\pi} \int_{-p\pi}^{p\pi} |P_L(e^{i\theta})|^2\, d\theta = O(p^{2L+3}) \tag{35}$$

for all integer L. Equation (33) follows immediately.
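
The two properties of $P_L(z)$ used in this construction can be verified directly in exact integer arithmetic. The sketch below expands the product (34) and checks that every coefficient is ±1 and that the power sums $\sum_n c_n n^j$ vanish for j = 0, ..., L, which is equivalent to a zero of order L + 1 at z = 1. The helper names are illustrative only.

```python
def product_polynomial(L):
    """Coefficients (lowest degree first) of P_L(z) = prod_{l=0}^{L} (1 - z**(2**l))."""
    coeffs = [1]
    for l in range(L + 1):
        shift = 2 ** l
        # Multiply the current polynomial by (1 - z**shift).
        shifted = [0] * shift + [-c for c in coeffs]
        padded = coeffs + [0] * shift
        coeffs = [a + b for a, b in zip(padded, shifted)]
    return coeffs


def moment(coeffs, j):
    """Power sum sum_n c_n * n**j of the coefficient sequence.

    The moments for j = 0..L all vanish exactly when the polynomial
    has a zero of order at least L + 1 at z = 1.
    """
    return sum(c * n ** j for n, c in enumerate(coeffs))


if __name__ == "__main__":
    for L in range(4):
        c = product_polynomial(L)
        print(L, c, [moment(c, j) for j in range(L + 2)])
```

For L = 1, for instance, the coefficients are 1, −1, −1, 1; the moments of order 0 and 1 vanish and the moment of order 2 does not.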

Short of finding $d_{\min}^2$ exactly, there are a few mathematical questions
that suggest themselves and that may be less difficult than the full
problem. Thus, Fig. 1 prompts one to ask if there is a neighborhood
of p = 1 where $d_{\min}^2/4$ is the pulse energy. Another question has to do
with pulse design. Given that $G(\theta)$ is symmetric, positive, $L_2$, and
supported on $(-p\pi, p\pi)$, is $G(\theta)$ = constant the best choice to maximize
the minimum distance (subject to fixed pulse energy)?

APPENDIX
Landau's Proof

In this appendix we present another proof of the fact, used in Section IV, that

$$\sup \frac{|g_0|^2}{\sum_{n=1}^{\infty} |g_n|^2} = \infty, \tag{36}$$

where the sup is taken over all $g(t) \in L_2(-\infty, \infty)$ which are band-limited
to $(-p\pi, p\pi)$. Our proof in the text relied on the published
results of work by Salz.⁶ Here we give a self-contained, but more
mathematical, proof of (36) which was developed by H. J. Landau.
Suppose (36) is not true, i.e., suppose that

$$\frac{|g_0|^2}{\sum_{n=1}^{\infty} |g_n|^2} \le k < \infty \quad \text{for all } g(t) \text{ of BW} = p\pi. \tag{37}$$

Then,

$$|g_0|^2 \le k \sum_{n=1}^{\infty} |g_n|^2. \tag{38}$$

i 



From Carlson's lemma, $g_0$ is a linear functional on the $l_2$ sequence
$\{g_1, g_2, \ldots, g_k, \ldots\}$ and, from (38), this linear functional is bounded.
Therefore, by the standard Riesz representation* for bounded linear
functionals, we may write

$$g_0 = \sum_{n=1}^{\infty} b_n g_n, \qquad \sum_{n=1}^{\infty} b_n^2 < \infty, \tag{39}$$

where the $b_n$ do not depend on $g(t)$. We now consider the function

$$p(z) = 1 - \sum_{n=1}^{\infty} b_n z^n, \tag{40}$$

which is analytic for |z| < 1. For any $G(\theta) \in L_2(-p\pi, p\pi)$, we may
write, using (39),

$$\int_{-p\pi}^{p\pi} G(\theta)\, d\theta = \sum_{n=1}^{\infty} b_n \int_{-p\pi}^{p\pi} e^{in\theta} G(\theta)\, d\theta = \int_{-p\pi}^{p\pi} \left( \sum_{n=1}^{\infty} b_n e^{in\theta} \right) G(\theta)\, d\theta. \tag{41}$$

Therefore,

$$\lim_{N \to \infty} \int_{-p\pi}^{p\pi} \left( 1 - \sum_{n=1}^{N} b_n e^{in\theta} \right) G(\theta)\, d\theta = 0 \tag{42}$$

for all $G(\theta) \in L_2(-p\pi, p\pi)$.† By the completeness of $L_2$, we must have
$1 - \sum_{n=1}^{\infty} b_n e^{in\theta} = 0$ a.e. on $(-p\pi, p\pi)$. Since the radial limit of the $H^2$
function $p(z)$ vanishes on a set of positive measure, $p(z)$ itself must
vanish for |z| < 1. (See Ref. 7, p. 373, Theorem 17.18.) However,
p(0) = 1, and, hence, we have a contradiction, denying the validity
of (37).

* Not all $l_2$ sequences $\{g_n\}$ give rise to an appropriate $g(t)$, and hence, the linear
functional $g_0$ is not defined on all of $l_2$. Therefore, before using the Riesz theorem,
the Hahn-Banach theorem should be invoked to extend $g_0$ to a bounded linear
functional on all of $l_2$.